SURVEY OF ONLINE ACCESS TO SOCIAL SCIENCE DATA BASES

By Robert Donati 
Lockheed Information Systems 
New York, N.Y.

Abstract 

Until very recently there was little computer access to comprehensive
bibliographic data bases in the social sciences. Now online
searching of several directly relevant files is made possible through
services such as the Lockheed DIALOG system. These data bases are
briefly surveyed, with emphasis on content, structure, and strategy
appropriate for online, interactive searching. Indexes discussed in
the paper include Social Sciences Citation Index (R), Sociological
Abstracts, Psychological Abstracts, Language and Language Behavior
Abstracts, Historical Abstracts, ERIC, Exceptional Child Education
Abstracts, Foundations Directory, Foundations Grants Index and others.
Coverage of certain social science topics is quantitatively compared
among several social science and more general files, including NTIS
and Comprehensive Dissertations Abstracts Index. Techniques for online
thesaurus utilization are described, as are the systematic application
of the same strategy across files through a search save feature and
the use of merged keyword and term indexes from several data bases.
The relatively modest costs of such services are briefly analyzed.



Given June 8, 1976 on the panel "Data Base Update: Innovations
in Social Science Information Handling," during the Social Science
Division session of the 67th Annual Convention of the Special
Libraries Association, Denver, Colorado.
Social science researchers and librarians need no longer feel
"left out" of the rapid developments which have occurred in
computer-based information systems. Within the last two years, many
tools of information science have been applied to the social sciences
so that a substantial amount of information is now more widely
accessible.

For example, there are now eight social science bibliographic
data bases on the Lockheed DIALOG online retrieval service, totalling
approximately 1,070,000 document records as of early June, 1976.
Furthermore, perhaps 25% of over 1.1 million additional records in
six closely related files provide further in-depth information of
some interest to the social scientist. Thus, over 1.3 million
document records directly relevant to the social sciences are
available, representing 10% of the 13 million records loaded on
DIALOG. This percentage cannot be regarded as small, given the
extensive size of the scientific literature and the long-established,
government-funded programs in scientific and technical information.

Just what data bases are now available? With respect to the
social sciences, there are both multidisciplinary files such as Social
Sciences Citation Index (R) and Comprehensive Dissertation Abstracts
Index, and a number of discipline oriented files such as Psychological
Abstracts, Sociological Abstracts, and Historical Abstracts. A
complete list of those now available on the DIALOG service is given
in Figure 1. There are also other data bases, such as ABI/INFORM and
NTIS, which have high social science content. Many of the scientific
and business files also contain material of interest. For example,
the field of linguistics is especially well covered in INSPEC
Computers and Control Abstracts.

The question of database selection for a particular search is,
I believe, a less difficult choice than it often appears. There
are now available at least three and frequently several more good
choices for virtually any topic in the social sciences: (1) the
appropriate functional database, such as Language and Language
Behavior Abstracts, (2) the only truly comprehensive social science
database, Social Sciences Citation Index (R), and (3) Comprehensive
Dissertations Index. ERIC is almost always a good fourth choice,
especially where the teaching of the subject is involved.

As an aid to data base selection, I have prepared a rudimentary
directory of subject content for the files directly concerned
(Figure 2). The subject areas include the 25 social science topics
chosen for the informal roundtable discussions at the meetings of the
SLA Social Science Division and were augmented by a few additional
topics. For each data base I rated coverage of each topic as
high (H), moderate (M), small (S), or minor/negligible. These
opinions are my own, based on a review of the subject classification
schemes, journal coverage, and my own experience with the files.
Unfortunately, a lengthy study would be required to make a more
rigorous quantification of content. I would welcome comments from
librarians and abstracting and indexing services on the validity
of the chart. I believe, however, the table to be especially useful
in steering the novice user to the more important data bases which
cover a given topic.

Which files need to be used depends, of course, on the purpose and
scope of the search. Primary developments in the field might be
covered by examining one or two discipline-oriented data bases. If
government activity or sponsorship is involved, then NTIS should
probably be considered as well. A question involving behavior of
individuals, groups, or organizations should likely be searched in
Psychological Abstracts. Information on current research may
frequently be found in Foundation Grants Index. If an extremely
comprehensive bibliography is required, it may be necessary to search
a half-dozen or more files.

To obtain a more precise evaluation of coverage, a search was run 
on several key data bases to determine the number of document records 
in which each phrase occurred literally. The results (Figure 3) again
show multiple data base coverage for many topics. 

Most of the data bases with which we are concerned are biblio- 
graphic, wherein the machine record is a surrogate for the original 
document containing at least title, authors and bibliographic 
citation. A review of the major data elements of the DIALOG social 
science data bases reveals a great similarity (Figure 4) in structure 
with but few significant exceptions. Most of the files also contain
assigned subject indexes (controlled and uncontrolled), narrative
abstracts or annotations, and author affiliations. Some also have
thesauri online and numerical classification schemes.

There are also several files which are statistical or mixed
bibliographic/statistical. Examples of the first are Foundations
Directory and Foundations Grants Index; examples of the latter
include the Predicasts Market Abstracts, F & S Indexes, and Domestic/
International Statistics. Online access is rapidly increasing in
data bases which are factual or whose records may be described as
"informative" in contrast to the "descriptive" records of the
bibliographic data bases.

What effect does knowledge of record structure and content have on
search comprehensiveness and precision? The obvious answer may be
"plenty." While a good understanding of the files is desirable and
usually necessary for high recall of documents and efficient use of
time and resources, the advent of online, interactive searching has
made it possible to do relatively effective searching without this
detailed knowledge. Why? For several fundamental reasons:

(1) the general ability to interact with results, 
changing search strategy as one proceeds, 

(2) natural language (or "full-text") searching allowing
one to look for exact phrases within subject-indicating 
fields such as titles, indexes, abstracts, etc., 

(3) proximity and field specifications of terms, giving
more precision than the simple logical intersection
("and" operation),

(4) display of inverted indexes showing all terms actually
used, thus guiding the user to alphabetically and 
conceptually near terms, 

(5) online thesauri which provide expanded cross references 
to topics, 

(6) truncation of word stems, obviating such problems as 
teenager, teenagers, teen-ager, teen ager, etc., 

(7) offline merged term indexes for several data bases by 
broad subject categories, 

(8) search save capability, whereby a search concept (e.g.,
women's liberation) or an entire search can be defined
in full text fashion, saved in the computer, and applied
to one or more data bases.
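
Several of the features listed above (the inverted index, truncation of word stems, and word adjacency as opposed to a plain "and") can be illustrated with a minimal sketch. The code below is not the DIALOG implementation; the records, function names, and index layout are hypothetical and chosen only to show why an adjacency test is more precise than a logical intersection.

```python
import re
from collections import defaultdict

# Toy records standing in for titles; they are illustrative, not real file content.
RECORDS = {
    1: "Citizen participation in a community mental health center",
    2: "Teenagers and teen-age drinking",
    3: "Participation of the citizen in urban planning",
    4: "A teenager looks at citizen control",
}


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(records):
    """Build an inverted index: word -> {record id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for rec_id, text in records.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][rec_id].append(pos)
    return index


def select(index, term):
    """Records containing the exact word."""
    return set(index.get(term, {}))


def select_truncated(index, stem):
    """Records containing any word that begins with the stem."""
    hits = set()
    for word, postings in index.items():
        if word.startswith(stem):
            hits |= set(postings)
    return hits


def select_adjacent(index, first, second):
    """Records where `second` immediately follows `first` (an exact phrase)."""
    hits = set()
    for rec_id, positions in index.get(first, {}).items():
        following = set(index.get(second, {}).get(rec_id, []))
        if any(pos + 1 in following for pos in positions):
            hits.add(rec_id)
    return hits


idx = build_index(RECORDS)
print(select_truncated(idx, "teen"))                        # teenager, teenagers, teen-age
print(select(idx, "citizen") & select(idx, "participation"))  # simple "and"
print(select_adjacent(idx, "citizen", "participation"))       # exact phrase only
```

In this toy file the simple intersection retrieves records 1 and 3, while the adjacency test retrieves only record 1, where the two words actually form the phrase.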

Most of these topics are illustrated in the examples given in
Figures 5 and 6. While a searcher should usually divide the topic
into logically independent concepts prior to search time, it is really
unnecessary always to make an exhaustive analysis of offline search
aids if time is short. It is often a wise strategy to take a minimum
amount of key input, make logical combinations and then output a few
good records online, selecting the significant terms for
reincorporation into the strategy.

The search shown in Figure 6 illustrates this point. The topic
was "citizen participation in the operation of community facilities."
The basic inverted index of the file selected (Psychological Abstracts) 
Figure 5
DIALIST™

Term Frequency Indexes from the DIALOG Files
In Microfiche

EXTRACT FROM MERGED-TERM INDEX
SOCIAL SCIENCES GROUP

                              PSYCH      SOCIAL     EXCEPT
                     ERIC   ABSTRACTS  SCISEARCH    CHILD

crime                 414      1173        619        31
crime conviction                 20
crimes                 37       172        (01         9
criminal              273       731        997        38
criminal law            7        48
criminality            13       108         55         6
criminals             109      1081         26        29
Figure 6
Sample DIALOG Search

EXPAND COMMUNITY FACILITIES

Ref   Index-Term                     Type   Items   RT
E1    COMMUNITIZED                              1
E2    COMMUNITY                              6075
E3    COMMUNITY ATTITUDES                      94    1
E4    COMMUNITY COLLEGE STUDENTS               10    2
E5    COMMUNITY COLLEGES                        1
E6   -COMMUNITY FACILITIES                     51   15
E7    COMMUNITY MENTAL HEALTH                  76    7
E8    COMMUNITY MENTAL HEALTH

EXPAND E6

Ref   Index-Term                     Type   Items   RT
R1    COMMUNITY FACILITIES                     51   15
R2    CHILD GUIDANCE CLINICS          R        57    7
R3    COMMUNITY MENTAL HEALTH
        CENTERS                       N       504   12
R4    COMMUNITY SERVICES              R      1504    8
R5    DAY CARE CENTERS                R        26    3
R6    HALFWAY HOUSES                  R        29    6
R7    HOUSING                         N       166    2
R8    PUBLIC TRANSPORTATION           N        12    4
R9    RECREATION AREAS                R        10    4
R10   REHABILITATION CENTERS          R        30    3
R11   RELIGIOUS BUILDINGS             R         4   15
R12   SCHOOLS                         R      4049
R13   SHELTERED WORKSHOPS             R        55    3
R14   SHOPPING CENTERS                N         4    3
R15   SUICIDE PREVENTION CENTERS      N        30    6
R16   URBAN PLANNING                  R        20    5

SELECT R1-R6, R9, R10, R13, R15

      1   2063   R1-R6, R9, R10, R13, R15
                 R1: COMMUNITY FACILITIES

SELECT CITIZEN(W)PARTICIPATION

      2     20   CITIZEN(W)PARTICIPATION
Figure 6 (cont'd)

COMBINE 1 AND 2

      3     11   1 AND 2
TYPE 3/6/1-5

1. DOC YEAR: 1976 VOL NO: 55 ABSTRACT NO: 20739
   A Study of Citizen Participation in a Community Mental Health
   Center.

2. DOC YEAR: 1976 VOL NO: 55 ABSTRACT NO: 02118
   Citizen Participation in Decisionmaking: Myth or Strategy?

3. DOC YEAR: 1975 VOL NO: 53 ABSTRACT NO: 08054
   Citizen Participation and Conflict.

4. DOC YEAR: 1975 VOL NO: 53 ABSTRACT NO: 02956
   Advocates For Themselves: Citizen Participation in Federally
   Supported Community Organizations.

TYPE 3/5/4

DOC YEAR: 1975 VOL NO: 53 ABSTRACT NO: 02956
Advocates For Themselves: Citizen Participation in Federally
Supported Community Organizations
Mogulof, Melvin B.
Urban Inst, Washington DC
Community Mental Health Journal 1974 Spr Vol 10(1) 66-76
Discusses variations in the intensity of citizen participation in
community organizations and variations in the decision structures for
participation (e.g., advisory mechanisms or citizen control). It is
concluded that although control mechanisms may have certain negative
consequences for racial integration, citizen participation should be
viewed as a policy goal as well as an instrument for achieving other
goals.
CLASSIFICATION- 09
SUBJECT TERMS- COMMUNITY SERVICES, PARTICIPATION: 10690, 36810
INDEX PHRASE- Citizen Participation, Community Organizations

SELECT PARTICIPATION

      4   2102   PARTICIPATION
SELECT CITIZEN(W)CONTROL

      5      2   CITIZEN(W)CONTROL
COMBINE 1 AND (4 OR 5) NOT 3

      6     81   1 AND (4 OR 5) NOT 3
PRINT 3/5; PRINT 6/5/1-81

END
was browsed for "community facilities." This term was used and the
computer indicated there were 51 documents posted to it, and 15 cross
references. These were then called up and ten appropriate terms were
selected, yielding a "set" of 2,063 documents. The exact phrase
"citizen participation" was then selected (from the entire data base)
and combined with the first group on community facilities. A few
titles were typed online. One of the "hits" was reviewed and the
subject term "participation" was noticed as well as the phrase "citizen
control." These were then selected, combined with "community
facilities" and the additional hits identified. All the hits were
then printed in an offline bibliography. This search cost $7.56 for
online search time, and offline printing of full records for the 92
hits would have been $9.20.
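
The set logic of this search can be sketched with ordinary set operations. The record numbers below are toy values (the actual sets held 2,063, 20, 2,102 and 2 records), and the per-record print price of $0.10 is an inference from the quoted $9.20 for 92 hits rather than a stated rate.

```python
# Toy sets of record numbers standing in for the sets built during the search.
facilities = {1, 2, 3, 4, 5, 6, 7, 8}       # set 1: ten community-facility terms
citizen_participation = {2, 3, 9}           # set 2: CITIZEN(W)PARTICIPATION

# COMBINE 1 AND 2 -> set 3 (the first group of hits)
set3 = facilities & citizen_participation

participation = {2, 3, 4, 5, 10}            # set 4: subject term PARTICIPATION
citizen_control = {6, 11}                   # set 5: CITIZEN(W)CONTROL

# COMBINE 1 AND (4 OR 5) NOT 3 -> set 6 (additional hits not already in set 3)
set6 = (facilities & (participation | citizen_control)) - set3


def offline_print_cost(n_records, price_per_record=0.10):
    """Cost of offline prints; the default price is an assumed $0.10 per record."""
    return round(n_records * price_per_record, 2)


print(sorted(set3), sorted(set6))
print(offline_print_cost(11 + 81))  # the 11 and 81 hits reported in Figure 6
```

Because set 3 is subtracted when set 6 is formed, the two printed groups never overlap, which is why the 11 and 81 hits add directly to the 92 records mentioned above.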

Another example of surprisingly modest costs for online searching
is given in Figure 7. Here, a "feminism/women's rights" concept was
defined, using eight exact phrases and the truncated stem "feminis".
The strategy was initially defined in Social Sciences Citation Index (R)
and then stored away. Five and one quarter minutes were required for
this operation. The search was then recalled and executed in each of
five other files. A total of 1,466 hits were obtained in all six
files (the degree of duplication is unknown but is probably around
15%), requiring a total of 18 minutes terminal time, costing $17.17 in
computer and data communication costs.
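
The idea of defining a strategy once and re-running it file by file can be sketched as follows. This is a hypothetical illustration, not the DIALOG search save mechanism: the class, function names, and toy files are assumptions. The last part simply re-adds the per-file hits, times, and costs reported in Figure 7.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SavedSearch:
    """A stored strategy: truncated stems and exact phrases, OR-ed together."""
    stems: tuple
    phrases: tuple

    def matches(self, text):
        lowered = text.lower()
        words = re.findall(r"[a-z']+", lowered)
        return (any(w.startswith(s) for w in words for s in self.stems)
                or any(p in lowered for p in self.phrases))


# The stem and eight phrases listed in Figure 7.
FEMINISM = SavedSearch(
    stems=("feminis",),
    phrases=("women's rights", "womens rights",
             "women's liberation", "womens liberation",
             "women's studies", "womens studies",
             "women's lib", "womens lib"),
)


def run_saved(search, files):
    """Apply one saved strategy to every file and count the hits in each."""
    return {name: sum(search.matches(rec) for rec in records)
            for name, records in files.items()}


# Toy files; the titles are invented for illustration only.
TOY_FILES = {
    "file A": ["Feminist theory in sociology", "Urban housing policy"],
    "file B": ["Women's studies programs", "The womens lib movement", "Day care"],
}
print(run_saved(FEMINISM, TOY_FILES))

# (data base, hits, minutes, cost in dollars) as reported in Figure 7.
FIGURE_7 = [
    ("Social Scisearch", 205, 5.22, 6.79),
    ("Psychological Abstracts", 159, 2.77, 2.68),
    ("ERIC", 928, 6.15, 3.38),
    ("ABI/INFORM", 32, 1.13, 1.37),
    ("Foundation Grants Index", 35, 1.05, 1.19),
    ("Compr. Dissert. Index", 107, 1.68, 1.76),
]
total_hits = sum(row[1] for row in FIGURE_7)
total_minutes = round(sum(row[2] for row in FIGURE_7), 2)
total_cost = round(sum(row[3] for row in FIGURE_7), 2)
print(total_hits, total_minutes, total_cost)
```

The per-file figures add up to the 1,466 hits, 18 minutes, and $17.17 quoted in the text.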

In assessing the cost/benefits of online searching, one ought
Figure 7
SEARCH SAVE EXAMPLE

I. Search Definition and Saving

A general "feminism/women's rights" concept was defined in the
Social Scisearch data base using natural language by
combining (in the "or" sense) all records containing one or more of
the following phrases:

   feminis (truncated form for feminism, feminist, feminists, etc.)

   women's rights       women's liberation
   womens rights        womens liberation
   women's studies      women's lib
   womens studies       womens lib

II. Execution of Search on Several Data Bases

                                              Search Time
   Data Base                   No. of Hits      (Mins.)       Cost*

    7 Social Scisearch             205            5.22       $ 6.79
   11 Psychological Abstracts      159            2.77         2.68
    1 ERIC                         928            6.15         3.38
   15 ABI/INFORM                    32            1.13         1.37
   27 Foundation Grants Index       35            1.05         1.19
   35 Compr. Dissert. Index        107            1.68         1.76

      Totals                     1,466**         18.00       $17.17

 * Covers data base rate and data communications network.
   Excludes offline printing.
** This is the total with duplicate citations.
to consider the cost of a typical search rather than search rates.
Most searches cost the user somewhere between $2.00 and $20.00.
These figures include all costs paid to the retrieval service vendor
(computer time, data communications, and offline printing) but
exclude terminal rental, telephone cost (if any) to reach a network
number, and library personnel time. The average search time
determined from hundreds of thousands of searches is currently
running 10 minutes, although many searches are done in a few minutes
and some may require 20 or more minutes. The average number of
offline prints is 24. Using these statistical averages, the average
search cost in certain data bases is as follows:

ERIC - $7.90          PSYCH ABSTRACTS  - $12.07

NTIS - $9.57          SOCIAL SCISEARCH - $15.40

The savings of many man hours to perform such searches manually 
(or even the impossibility of conducting such searches manually) 
appear to be substantial when compared to the relatively modest 
costs of searching indicated above. 
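
The averages quoted above can be reproduced with a simple formula: connect time at an hourly rate plus a charge per offline print. The sketch below is an assumption-laden reconstruction. The hourly rates are back-solved from the times and costs in Figure 7 (and, for NTIS, from the $9.57 average alone), and the $0.10 print price is inferred from the $9.20 quoted for 92 offline records; none of these is a rate stated in the paper.

```python
# Assumed combined hourly rates (data base plus communications), in dollars.
# Back-solved from the paper's figures; not published prices.
ASSUMED_RATE_PER_HOUR = {
    "ERIC": 33.0,
    "NTIS": 43.0,
    "PSYCH ABSTRACTS": 58.0,
    "SOCIAL SCISEARCH": 78.0,
}
ASSUMED_PRINT_PRICE = 0.10  # dollars per offline record (inferred)

# Average search costs as quoted in the text.
QUOTED_AVERAGE = {
    "ERIC": 7.90,
    "NTIS": 9.57,
    "PSYCH ABSTRACTS": 12.07,
    "SOCIAL SCISEARCH": 15.40,
}


def average_search_cost(rate_per_hour, minutes=10, prints=24,
                        print_price=ASSUMED_PRINT_PRICE):
    """Connect-time cost plus offline print cost for one search."""
    return rate_per_hour * minutes / 60 + prints * print_price


for name, rate in ASSUMED_RATE_PER_HOUR.items():
    cost = average_search_cost(rate)
    print(f"{name:17s} computed ${cost:5.2f}   quoted ${QUOTED_AVERAGE[name]:5.2f}")
```

Under these assumptions the computed values agree with the quoted averages to the nearest cent, which suggests the averages were derived from the 10-minute and 24-print figures in this way.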

In conclusion, it is interesting to observe that in this year,
which marks the centennial of the telephone and roughly the tenth
anniversary of online interactive retrieval, social scientists
and librarians now have at their disposal the mutually harnessed
technology of both communications and computers, including

  * a large number of discipline oriented files,
  * several good multi-disciplinary data bases,
  * online and offline vocabulary identification aids, and
  * cross database search capabilities.

It is hoped that usage will grow so that future developments in
database coordination and retrieval system capabilities can be
based on extensive experience of social science librarians and
their patrons.